Exploiting Short-Lived Values for Low-Overhead Transient Fault Recovery
Authors
Abstract
CMOS downscaling trends, manifested in the use of smaller transistor feature sizes and lower supply voltages, make microprocessors more and more vulnerable to transient errors with each new technology generation. One architectural approach to detecting such errors and recovering from them is to execute two copies of the same program and then compare the results. In this paper, we propose a technique to dramatically reduce the performance and energy overhead of result verification by exploiting the observation that a large percentage of generated result values in a datapath are short-lived. Our scheme avoids verification of short-lived values without impeding transient fault detection and recovery capabilities.
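As a rough illustration of the observation the abstract relies on, the sketch below counts how many result values in a toy instruction trace would still need comparison if short-lived values were skipped. The abstract does not define "short-lived", so the definition used here (the destination register is overwritten within a fixed window of following instructions) is an assumption, as are the names `short_lived_flags`, `comparisons_needed`, and the trace format. The sketch does not model how the actual scheme preserves fault detection and recovery.

```python
from typing import List, Tuple

# Hypothetical trace format: (destination_register, source_registers).
Instr = Tuple[str, Tuple[str, ...]]


def short_lived_flags(trace: List[Instr], window: int) -> List[bool]:
    """Return one flag per instruction.

    flags[i] is True when the register written by instruction i is
    overwritten by one of the next `window` instructions. This is an
    assumed definition of "short-lived", not the paper's.
    """
    flags = []
    for i, (dest, _sources) in enumerate(trace):
        following = trace[i + 1 : i + 1 + window]
        flags.append(any(later_dest == dest for later_dest, _ in following))
    return flags


def comparisons_needed(trace: List[Instr], window: int) -> int:
    """Count results that are not short-lived and so would still be compared."""
    return sum(1 for flag in short_lived_flags(trace, window) if not flag)


# A small made-up trace in which r1 and r4 are each rewritten quickly.
trace: List[Instr] = [
    ("r1", ("r2", "r3")),
    ("r4", ("r1",)),
    ("r1", ("r4", "r5")),
    ("r6", ("r1",)),
    ("r4", ("r6",)),
]

if __name__ == "__main__":
    print(short_lived_flags(trace, window=3))
    print(comparisons_needed(trace, window=3), "of", len(trace), "results compared")
```

In this toy model a larger window classifies more values as short-lived, so fewer comparisons remain; the window size is a free parameter of the sketch and not something the abstract specifies.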
Similar resources
Encore: Low-Cost, Fine-Grained Transient Fault Recovery
To meet an insatiable consumer demand for greater performance at less power, silicon technology has scaled to unprecedented dimensions. However, the pursuit of faster processors and longer battery life has come at the cost of device reliability. Given the rise of processor (un)reliability as a first-order design constraint, there has been a growing interest in low-cost, non-intrusive techniques...
Design and Analysis of Transient Fault Tolerance for Multi Core Architecture
This paper describes a software approach to fault tolerance for shared-memory multi-core systems using PLR. PLR takes a software-centric approach to transient fault tolerance that ensures correct software execution. The scheme operates at the user-space level and does not require changes to the original application. PLR creates a set of redundant processes per application process. In this scheme ...
A New Design of Fault Tolerant Comparator
In this paper we present a new design of a fault-tolerant comparator with a fault-free hot spare. The aim of this design is to achieve low time and area overhead in fault-tolerant comparators. We use the hot standby technique to keep the system operating normally without interruption, and a dynamic recovery method for fault detection and correction. The circuit is divided into smaller modules...
Influence of Fault Current Limiter in Voltage Drop and TRV Considering Wind Farm
Distributed generation systems in distribution systems can increase the level of short-circuit current. The effectiveness of distributed generation systems is affected by their size, location, technology type, and method of connection to the distribution system. A wind turbine system is one example of a distributed generation source. Not only does...
A Recovery-Oriented Approach for Software Fault Diagnosis in Complex Critical Systems
This paper proposes an approach to software fault diagnosis in complex fault-tolerant systems, encompassing the phases of error detection, fault location, and system recovery. Errors are detected in the first phase, exploiting operating system support. Faults are identified during the location phase through a machine-learning-based approach. Then, the best recovery action is triggered onc...
Publication date: 2006